1 Mount Hood Environmental, PO Box 1303, Challis, Idaho, 83226, USA
2 Mount Hood Environmental, 39085 Pioneer Boulevard #100 Mezzanine, Sandy, Oregon, 97055, USA
3 Mount Hood Environmental, PO Box 4282, McCall, Idaho, 83638, USA
✉ Correspondence: Bryce N. Oldemeyer <Bryce.Oldemeyer@mounthoodenvironmental.com>, Mark Roes <mark.roes@mthoodenvironmental.com>
Habitat covariates recently used in six quantile random forest (QRF) capacity models (Chinook salmon and steelhead; summer parr, winter presmolts, and redds) were chosen for their high predictive power in estimating capacity across the Columbia River Basin (cite IRA?). However, a subset of the covariates included in the QRF models was not necessarily useful for restoration project monitoring or for describing target conditions for restoration design, because those covariates are not easily manipulated by project actions. Additionally, some of the CHaMP covariates used in previous models were difficult to replicate or measure using streamlined fish habitat protocols (DASH; Carmichael et al. 2019). To increase the utility of the QRF model for project monitoring, design, and future data collection efforts, we explored alternative covariates that: 1) maintained high predictive power, 2) were informative for restoration efforts and monitoring, 3) could be calculated from DASH surveys, 4) were not missing an overabundance of data, and 5) were not highly correlated with other covariates in the models. Using these criteria, we developed six modified QRF models that were more informative for restoration design and monitoring, included covariates that could be calculated using newly developed stream habitat protocols, and maintained a level of predictive power similar to that of the original QRF habitat capacity models.
Similarly, a random forest extrapolation model has been used to predict capacity estimates across larger scales where CHaMP or DASH data are absent (cite IRA). We revisited the globally available attributes (GAAs) included in the original random forest extrapolation model and made minor modifications so that the GAAs included in the extrapolation model better aligned with the modified QRF model covariates.
Below is a brief document outlining these efforts, as well as a comparison of extrapolation estimates from the original and modified QRF models for eight watersheds in the Upper Salmon River region.
The QRF model was fit using a revised covariate selection process that placed more emphasis on compatibility with future data collection via DASH and on the ability to predict restoration effects. Habitat data collected by CHaMP and other sources (e.g. NorWeST stream temperature) were subset to a list of potential covariates that could be reproduced using DASH data collection. This provides an opportunity to collect new paired fish and habitat data using DASH protocols, reducing the reliance of the QRF model on the CHaMP habitat data.
Below is the rubric used to inform the covariate selection process for each of the six models (Chinook and steelhead; winter, summer, and redds):

1. Strength of the relationship between the covariate and the response variable (based on MIC score)
2. Informative for restoration efforts (Yes/No)
3. Could be calculated using DASH data (Yes/No)
4. How much data were missing and/or how many values were "0"?
5. How correlated was the covariate with other covariates in the original QRF model, and with covariates within the same model?
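The rubric can be read as a screening step followed by a ranking step. The sketch below is one possible way to express that in Python; it is not the authors' implementation. The `Candidate` fields, the numeric cutoffs `MAX_MISSING` and `MAX_CORR`, and every value in the example list are hypothetical, since the report gives no numeric thresholds or scores.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    mic: float                    # strength of association with the response (MIC score)
    restoration_informative: bool  # rubric item 2
    dash_compatible: bool          # rubric item 3
    frac_missing_or_zero: float    # rubric item 4: share of records missing or zero
    max_abs_corr: float            # rubric item 5: largest |r| with covariates already chosen


# Hypothetical cutoffs; the report does not state numeric thresholds.
MAX_MISSING = 0.25
MAX_CORR = 0.7


def screen(candidates):
    """Keep candidates that pass the yes/no and threshold checks, ranked by MIC."""
    kept = [
        c for c in candidates
        if c.restoration_informative
        and c.dash_compatible
        and c.frac_missing_or_zero <= MAX_MISSING
        and c.max_abs_corr <= MAX_CORR
    ]
    return sorted(kept, key=lambda c: c.mic, reverse=True)


# Illustrative values only; none of these numbers come from the report.
candidates = [
    Candidate("discharge", 0.42, False, True, 0.02, 0.55),
    Candidate("avg_thalweg_depth", 0.40, True, True, 0.01, 0.50),
    Candidate("large_wood_density", 0.21, True, True, 0.40, 0.30),
    Candidate("sinuosity", 0.18, True, True, 0.00, 0.25),
]
ranked = [c.name for c in screen(candidates)]
print(ranked)
```

In practice the rubric was applied with judgment rather than hard cutoffs, so a filter like this would only narrow the list before manual review.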
An oversimplified example of the covariate selection process might unfold as follows. In the original QRF model, discharge was likely included because it had a high MIC score and made biological sense. Unfortunately, discharge is not very informative for restoration efforts because most restoration actions cannot create water. Discharge (like many habitat covariates) is highly correlated with other habitat covariates, but those covariates were left out of the original QRF model for any number of reasons (high correlation with covariates already in the model, redundancy, etc.). Using the rubric, we found that average thalweg depth had a MIC score nearly as high as that of discharge, was informative for restoration, could be calculated with DASH, and was highly correlated with discharge (this high correlation is likely why average thalweg depth was left out of the original QRF model). Based on this information, we would substitute average thalweg depth for discharge in the model. This process was repeated for all other QRF covariates in each of the six models.
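The substitution logic in this example can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the procedure used for the report: the paired measurements, the MIC scores, the `min_corr` cutoff, and the helper names `pearson` and `pick_substitute` are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical site-level measurements (not data from the report).
discharge = [0.2, 0.5, 0.9, 1.4, 2.1, 3.0]
thalweg_depth = [0.15, 0.22, 0.30, 0.41, 0.52, 0.66]
large_wood = [5, 1, 4, 2, 6, 3]

candidates = {
    "avg_thalweg_depth": {"values": thalweg_depth, "mic": 0.40,
                          "restoration": True, "dash": True},
    "large_wood_frequency": {"values": large_wood, "mic": 0.25,
                             "restoration": True, "dash": True},
}


def pick_substitute(original_values, candidates, min_corr=0.7):
    """Among restoration-informative, DASH-compatible candidates that are
    strongly correlated with the original covariate, return the name with
    the highest MIC score (or None if nothing qualifies)."""
    eligible = [
        (info["mic"], name)
        for name, info in candidates.items()
        if info["restoration"] and info["dash"]
        and abs(pearson(original_values, info["values"])) >= min_corr
    ]
    return max(eligible)[1] if eligible else None


substitute = pick_substitute(discharge, candidates)
print(substitute)
```

A strong correlation with the original covariate is what lets the substitute carry similar predictive information while being something a restoration project can actually change.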
The covariate selection process was done independently for each species and each of the three life stages. At the end of the process, the final covariates were compared between the two species for each life stage. While the relative importance of the final covariates differed slightly between species, the covariates themselves were nearly identical between species for all three life stages. Because of this, we consolidated the species-specific models and proceeded with one winter juvenile, one summer parr, and one redd model to be used for both species.
| Name | Metric Category | Juv Sum Chnk | Juv Sum Sthd | Juv Win Chnk | Juv Win Sthd | Redds Chnk | Redds Sthd | Description |
|---|---|---|---|---|---|---|---|---|
| Channel Unit Frequency | ChannelUnit | 5 | 9 | 5 | 3 | 1 | 1 | Number of channel units per 100 meters. |
| Fast NonTurbulent Frequency | ChannelUnit | 6 | 13 | – | – | 13 | 4 | Number of Fast Water Non-Turbulent channel units per 100 meters. |
| Sinuosity | Complexity | 13 | 7 | 10 | 10 | 10 | 12 | Ratio of the thalweg length to the straight line distance between the start and end points of the thalweg. |
| Wetted Channel Braidedness | Complexity | 14 | 14 | 13 | 13 | – | – | Ratio of the total length of the wetted mainstem channel plus side channels and the length of the mainstem channel. |
| Fish Cover: Some Cover | Cover | 8 | 4 | 8 | 8 | 9 | 3 | Percent of wetted area with some form of fish cover |
| Large Wood Density | Cover | – | – | 4 | 5 | – | – | Large Wood per sq meter |
| Residual Depth | Size | – | – | 2 | 2 | – | – | Average residual depth of the channel unit. |
| Average Thalweg Depth | Size | 1 | 3 | – | – | 2 | 2 | Average Thalweg Depth, meters |
| Thalweg Exit Depth Avg | Size | – | – | 6 | 7 | – | – | Depth of the thalweg at the downstream edge of the channel unit. |
| Gradient | Size | 3 | 2 | 7 | 1 | 4 | 6 | Site water surface gradient is calculated as the difference between the top of site (upstream) and bottom of site (downstream) water surface elevations divided by thalweg length. |
| Residual Pool Depth | Size | 12 | 10 | – | – | 11 | 5 | The average difference between the maximum depth and downstream end depth of all Slow Water/Pool channel units. |
| Discharge | Size | – | – | 3 | 4 | – | – | The sum of station discharge across all stations. Station discharge is calculated as depth x velocity x station increment for all stations except first and last. Station discharge for first and last station is 0.5 x station width x depth x velocity. |
| Substrate Est: Boulders | Substrate | 10 | 12 | – | – | 8 | 11 | Percent of boulders (256-4000 mm) within the wetted site area. |
| Substrate Est: Cobble and Boulder | Substrate | – | – | 11 | 11 | – | – | Total cobble plus boulder percentage |
| Substrate Est: Cobbles | Substrate | 11 | 6 | – | – | 5 | 8 | Percent of cobbles (64-256 mm) within the wetted site area. |
| Substrate Est: Coarse and Fine Gravel | Substrate | 7 | 8 | 12 | 12 | 7 | 13 | Percent of coarse and fine gravel (2-64 mm) within the wetted site area. |
| Substrate Est: Sand and Fines | Substrate | 9 | 5 | 9 | 9 | 6 | 7 | Percent of sand and fine sediment (0.01-2 mm) within the wetted site area. |
| Avg. August Temperature | Temperature | 2 | 1 | – | – | 3 | 10 | Average predicted daily August temperature from NorWest, averaged across the years 2002-2011. |
| Elevation | Temperature | – | – | 1 | 6 | – | – | Elevation, meters |
| Large Wood Frequency: Wetted | Wood | 4 | 11 | – | – | 12 | 9 | Number of large wood pieces per 100 meters within the wetted channel. |
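The species comparison described above can be made concrete with the summer columns of the table. The sketch below assumes that the integers in the table are relative importance ranks (1 = most important) and that a dash means the covariate is not in that model; the table does not state this explicitly. It computes the overlap of the two covariate sets (Jaccard index) and the Spearman rank correlation of their importance ranks using the no-ties formula.

```python
# (covariate, Juv Sum Chnk, Juv Sum Sthd); None marks a dash in the table.
summer = [
    ("Channel Unit Frequency", 5, 9),
    ("Fast NonTurbulent Frequency", 6, 13),
    ("Sinuosity", 13, 7),
    ("Wetted Channel Braidedness", 14, 14),
    ("Fish Cover: Some Cover", 8, 4),
    ("Large Wood Density", None, None),
    ("Residual Depth", None, None),
    ("Average Thalweg Depth", 1, 3),
    ("Thalweg Exit Depth Avg", None, None),
    ("Gradient", 3, 2),
    ("Residual Pool Depth", 12, 10),
    ("Discharge", None, None),
    ("Substrate Est: Boulders", 10, 12),
    ("Substrate Est: Cobble and Boulder", None, None),
    ("Substrate Est: Cobbles", 11, 6),
    ("Substrate Est: Coarse and Fine Gravel", 7, 8),
    ("Substrate Est: Sand and Fines", 9, 5),
    ("Avg. August Temperature", 2, 1),
    ("Elevation", None, None),
    ("Large Wood Frequency: Wetted", 4, 11),
]

chinook = {name for name, c, s in summer if c is not None}
steelhead = {name for name, c, s in summer if s is not None}

# Jaccard index: shared covariates divided by covariates in either model.
jaccard = len(chinook & steelhead) / len(chinook | steelhead)

# Spearman rank correlation for the shared covariates (valid without ties).
pairs = [(c, s) for name, c, s in summer if c is not None and s is not None]
n = len(pairs)
sum_d2 = sum((c - s) ** 2 for c, s in pairs)
rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(f"Jaccard = {jaccard:.2f}, Spearman rho = {rho:.2f}")
```

Under these assumptions, a Jaccard index near 1 with a moderate rank correlation matches the description in the text: the same covariates appear for both species, but their relative importance differs.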
Talk about relative importance and pdp plots.
The spatial extent of QRF capacity predictions was limited to reaches with CHaMP habitat data, so capacity for all wadeable streams in the Columbia Basin was estimated through the development of an extrapolation model. This model used 'globally available attributes' (GAAs), obtained from a stream layer created by Morgan Bond and Tyler Nodine based on the National Hydrography Dataset High Resolution 1:24,000 line network, to estimate capacities predicted by the QRF model at the 200-meter reach scale. The extrapolation model used a random forest model structure. Consistent with the QRF model, the extrapolation model made no assumptions about the direction and distribution of the effects of predictors, and it constrained density estimates to the range of predictions from the QRF model. However, random forest methods did not account for variable strata weights across the CHaMP dataset, a source of potential bias that could be alleviated through the collection of additional paired fish and habitat data.
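The point that tree-based predictions stay within the range of the training responses can be illustrated with a toy model. The sketch below is not the extrapolation model: it bags one-split regression trees ("stumps") on a single made-up predictor, whereas a real random forest grows deeper trees on many GAAs. All data values and function names are hypothetical. Each prediction is an average of leaf means, and each leaf mean is an average of training responses, so predictions cannot fall outside the observed range, even for predictor values far from the training data.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression tree; return (threshold, left_mean, right_mean)."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (sse, threshold, lm, rm)
    if best is None:  # every x in this bootstrap sample was identical
        m = mean(ys)
        return (float("inf"), m, m)
    return best[1:]


def fit_bagged_stumps(xs, ys, n_trees=50, seed=1):
    """Fit stumps to bootstrap resamples of the training data."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps


def predict(stumps, x):
    """Average the leaf means selected by x across all stumps."""
    return mean(lm if x <= t else rm for t, lm, rm in stumps)


# Hypothetical training data: elevation (m) and QRF-predicted density.
elevation = [900, 1100, 1300, 1500, 1700, 1900, 2100, 2300]
density = [2, 3, 3, 5, 8, 9, 12, 14]

stumps = fit_bagged_stumps(elevation, density)
low, high = predict(stumps, 500), predict(stumps, 3500)
print(min(density) <= low <= max(density), min(density) <= high <= max(density))
```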
Extrapolation model covariates were selected from the list of GAAs and evaluated for inclusion using relative importance, partial dependence plots, and correlation between covariates. We used the covariates included in the previous extrapolation model as a starting point for selection. This resulted in the replacement of regime (an indicator of dominant precipitation type) with elevation and the removal of relative slope, which we found was redundant with gradient. Model results indicated that elevation was consistently one of the most important predictors in the model. This is particularly true for the Chinook parr summer model, where capacity predictions were primarily driven by elevation.
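One simple way to act on correlation between covariates is a greedy filter: walk through the covariates in order of importance and drop any that is strongly correlated with one already kept. The sketch below illustrates that idea only. The importance ordering, the correlation values, and the `threshold` are hypothetical, and the report's selection also relied on partial dependence plots and judgment.

```python
def drop_redundant(ordered_names, corr, threshold=0.8):
    """Keep covariates in importance order, skipping any whose absolute
    correlation with an already-kept covariate reaches the threshold.

    corr maps frozenset({a, b}) to a correlation; missing pairs count as 0.
    """
    kept = []
    for name in ordered_names:
        if all(abs(corr.get(frozenset((name, k)), 0.0)) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical importance ordering and pairwise correlations.
ordered = ["elevation", "gradient", "relative_slope", "sinuosity"]
corr = {
    frozenset(("gradient", "relative_slope")): 0.90,
    frozenset(("elevation", "gradient")): 0.35,
    frozenset(("gradient", "sinuosity")): -0.40,
}

selected = drop_redundant(ordered, corr)
print(selected)
```

With these made-up values, relative slope is dropped because gradient is kept first, which mirrors the redundancy noted in the text.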
| Metric | Description |
|---|---|
| Gradient % | Stream gradient (%). |
| Relative slope | Relative slope. Reach slope minus upstream slope. |
| Sinuosity | Reach sinuosity. 1 = straight, 1 < sinuous. |
| Alpine accumulation | Number of upstream cells in alpine terrain. |
| Fines accumulation | Number of upstream cells in fine grain lithologies. |
| Flow accumulation | Number of upstream DEM cells flowing into reach. |
| Gravel accumulation | Number of upstream cells in gravel producing lithologies. |
| Precipitation accumulation | Number of upstream cells weighted by average annual precipitation. |
| Floodplain width | Current unmodified floodplain width. |
| Avg Aug stream temperature | Historical composite scenario representing 10 year average August mean stream temperatures for 2002-2011 (Isaak et al. 2017). |
| Disturbance PCA 1 | Disturbance Classification PCA 1 Score (Whittier et al. 2011). |
| Natural PCA 1 | Natural Classification PCA 1 Score (Whittier et al. 2011). |
| Natural PCA 2 | Natural Classification PCA 2 Score (Whittier et al. 2011). |
| Elevation | Elevation at downstream end of reach |
Figure 4.1: Extrapolations of habitat capacity for Chinook salmon, by life-stage, for the eight watersheds within the Upper Salmon River Basin using the modified models.
Figure 4.2: Extrapolations of habitat capacity for steelhead, by life-stage, for the eight watersheds within the Upper Salmon River Basin using the modified models.
| Watershed | Juv summer capacity/km | Summer SE/km | Juv winter capacity/km | Winter SE/km | Redd capacity/km | Redd SE/km |
|---|---|---|---|---|---|---|
| EF Salmon | 12,335 | 1,452.9 | 885 | 210.5 | 3 | 0.1 |
| Lemhi | 5,766 | 459.4 | 1,038 | 112.6 | 3 | 0.1 |
| NF Salmon | 6,504 | 961.3 | 1,351 | 199.5 | 3 | 0.1 |
| Pahsimeroi | 5,146 | 357.3 | 1,689 | 189.8 | 3 | 0.1 |
| Panther Cr | 8,544 | 829.3 | 1,410 | 156.2 | 3 | 0.1 |
| Upper Salmon | 17,082 | 1,823.5 | 862 | 235.9 | 3 | 0.1 |
| Valley Cr | 15,833 | 1,726.0 | 961 | 270.8 | 3 | 0.2 |
| Yankee Fork | 14,967 | 1,916.6 | 833 | 200.9 | 3 | 0.2 |
| Watershed | Juv summer capacity | Summer SE | Juv winter capacity | Winter SE | Redd capacity | Redd SE |
|---|---|---|---|---|---|---|
| EF Salmon | 252,597 | 15,520.5 | 337,682 | 36,795 | 413 | 24 |
| Lemhi | 310,577 | 9,082.3 | 363,898 | 27,441 | 441 | 18 |
| NF Salmon | 242,471 | 18,381.8 | 313,118 | 27,955 | 323 | 22 |
| Pahsimeroi | 159,705 | 6,225.1 | 205,921 | 13,951 | 198 | 8 |
| Panther Cr | 268,476 | 13,598.0 | 339,671 | 19,946 | 317 | 15 |
| Upper Salmon | 243,548 | 14,843.6 | 310,879 | 39,013 | 452 | 32 |
| Valley Cr | 176,048 | 10,707.6 | 288,579 | 31,329 | 365 | 26 |
| Yankee Fork | 197,926 | 12,378.9 | 341,310 | 38,555 | 449 | 36 |
| Watershed | Juv summer capacity/km | Summer SE/km | Juv winter capacity/km | Winter SE/km | Redd capacity/km | Redd SE/km |
|---|---|---|---|---|---|---|
| EF Salmon | 1,525 | 93.7 | 2,039 | 222.2 | 2 | 0.1 |
| Lemhi | 1,774 | 51.9 | 2,079 | 156.8 | 3 | 0.1 |
| NF Salmon | 2,049 | 155.3 | 2,646 | 236.2 | 3 | 0.2 |
| Pahsimeroi | 1,924 | 75.0 | 2,481 | 168.1 | 2 | 0.1 |
| Panther Cr | 2,105 | 106.6 | 2,664 | 156.4 | 2 | 0.1 |
| Upper Salmon | 1,485 | 90.5 | 1,895 | 237.8 | 3 | 0.2 |
| Valley Cr | 1,465 | 89.1 | 2,401 | 260.7 | 3 | 0.2 |
| Yankee Fork | 1,249 | 78.1 | 2,154 | 243.4 | 3 | 0.2 |
Comparisons of watershed capacity estimates from the previous QRF and extrapolation models with those from the revised versions reveal modest differences in most cases, with the exception of Chinook parr summer capacities in several watersheds. The substantial increases in Chinook parr summer capacity are likely due to the inclusion of elevation in the extrapolation model and range from 21% to 222% compared to the previous extrapolation.
Figure 5.1: Comparison of Chinook salmon habitat capacity estimates between revised and original model extrapolation, by life-stage, for the eight watersheds within the Upper Salmon River Basin.
| Model | Watershed | Capacity per km | Total capacity | Capacity % change | Capacity SE |
|---|---|---|---|---|---|
| Juv summer | EF Salmon | 12,335.5 | 1,926,623 | 112 | 226,926 |
| Juv summer | Lemhi | 5,765.9 | 786,452 | 112 | 62,660 |
| Juv summer | NF Salmon | 6,503.6 | 339,275 | 13 | 50,148 |
| Juv summer | Pahsimeroi | 5,145.6 | 265,099 | 45 | 18,409 |
| Juv summer | Panther Cr | 8,543.7 | 1,219,542 | 21 | 118,369 |
| Juv summer | Upper Salmon | 17,081.6 | 3,301,286 | 163 | 352,419 |
| Juv summer | Valley Cr | 15,832.8 | 1,902,198 | 152 | 207,363 |
| Juv summer | Yankee Fork | 14,967.3 | 2,144,056 | 222 | 274,556 |
| Juv winter | EF Salmon | 884.9 | 138,214 | 0 | 32,880 |
| Juv winter | Lemhi | 1,037.5 | 141,515 | -8 | 15,359 |
| Juv winter | NF Salmon | 1,350.7 | 70,462 | 28 | 10,409 |
| Juv winter | Pahsimeroi | 1,688.7 | 86,999 | -8 | 9,781 |
| Juv winter | Panther Cr | 1,410.0 | 201,265 | 29 | 22,296 |
| Juv winter | Upper Salmon | 861.6 | 166,522 | -29 | 45,582 |
| Juv winter | Valley Cr | 961.5 | 115,517 | -12 | 32,535 |
| Juv winter | Yankee Fork | 832.8 | 119,298 | 20 | 28,783 |
| Redds | EF Salmon | 2.6 | 402 | -13 | 21 |
| Redds | Lemhi | 2.6 | 353 | 5 | 11 |
| Redds | NF Salmon | 3.2 | 166 | -5 | 8 |
| Redds | Pahsimeroi | 2.7 | 139 | 25 | 4 |
| Redds | Panther Cr | 3.1 | 448 | -4 | 17 |
| Redds | Upper Salmon | 3.0 | 575 | -20 | 29 |
| Redds | Valley Cr | 3.3 | 394 | -29 | 20 |
| Redds | Yankee Fork | 3.1 | 438 | -38 | 23 |
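The percent change column in the table above compares each revised estimate with the corresponding estimate from the original models. Assuming it is computed as (revised − original) / original × 100, which the report does not state explicitly, the original estimate can be approximately recovered from a revised value and its percent change. The function names below are hypothetical, and because the tabulated percentages are rounded to whole numbers, any back-calculated original is only approximate.

```python
def pct_change(revised, original):
    """Percent change of the revised estimate relative to the original."""
    return (revised - original) / original * 100.0


def implied_original(revised, pct):
    """Invert pct_change: the original estimate implied by a revised value
    and its percent change (approximate when pct has been rounded)."""
    return revised / (1.0 + pct / 100.0)


# Yankee Fork, Chinook juvenile summer: total capacity and % change from the table.
revised_total = 2_144_056
reported_pct = 222

original_total = implied_original(revised_total, reported_pct)
print(round(original_total))
print(round(pct_change(revised_total, original_total)))
```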
Figure 5.2: Comparison of steelhead habitat capacity estimates between revised and original model extrapolations, by life-stage, for the eight watersheds within the Upper Salmon River Basin.
| Model | Watershed | Capacity per km | Total capacity | Capacity % change | Capacity SE |
|---|---|---|---|---|---|
| Juv summer | EF Salmon | 1,525.4 | 252,597 | -31 | 15,521 |
| Juv summer | Lemhi | 1,774.2 | 310,577 | -15 | 9,082 |
| Juv summer | NF Salmon | 2,048.7 | 242,471 | -5 | 18,382 |
| Juv summer | Pahsimeroi | 1,924.2 | 159,705 | -18 | 6,225 |
| Juv summer | Panther Cr | 2,105.3 | 268,476 | -8 | 13,598 |
| Juv summer | Upper Salmon | 1,484.6 | 243,548 | -31 | 14,844 |
| Juv summer | Valley Cr | 1,465.0 | 176,048 | -28 | 10,708 |
| Juv summer | Yankee Fork | 1,249.4 | 197,926 | -29 | 12,379 |
| Juv winter | EF Salmon | 2,039.2 | 337,682 | -14 | 36,795 |
| Juv winter | Lemhi | 2,078.7 | 363,898 | -8 | 27,441 |
| Juv winter | NF Salmon | 2,645.6 | 313,118 | -1 | 27,955 |
| Juv winter | Pahsimeroi | 2,481.0 | 205,921 | -4 | 13,951 |
| Juv winter | Panther Cr | 2,663.6 | 339,671 | 8 | 19,946 |
| Juv winter | Upper Salmon | 1,895.1 | 310,879 | -26 | 39,013 |
| Juv winter | Valley Cr | 2,401.4 | 288,579 | -14 | 31,329 |
| Juv winter | Yankee Fork | 2,154.4 | 341,310 | -18 | 38,555 |
| Redds | EF Salmon | 2.5 | 413 | -13 | 24 |
| Redds | Lemhi | 2.5 | 441 | 10 | 18 |
| Redds | NF Salmon | 2.7 | 323 | -10 | 22 |
| Redds | Pahsimeroi | 2.4 | 198 | 2 | 8 |
| Redds | Panther Cr | 2.5 | 317 | -7 | 15 |
| Redds | Upper Salmon | 2.8 | 452 | -11 | 32 |
| Redds | Valley Cr | 3.0 | 365 | -20 | 26 |
| Redds | Yankee Fork | 2.8 | 449 | -25 | 36 |